[Bugfix] Guard mm_token_type_ids kwarg in get_mrope_input_positions#35711
hmellor merged 4 commits into vllm-project:main from
Conversation
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Code Review
This pull request correctly fixes a TypeError by guarding the mm_token_type_ids keyword argument before passing it to model.get_rope_index(). The fix involves inspecting the function's signature to ensure it can accept the argument. I've suggested a performance improvement to cache the result of the signature inspection, as it's a costly operation that doesn't need to be repeated on every call.
```python
import inspect

sig = inspect.signature(self.model.get_rope_index)
params = sig.parameters
if "mm_token_type_ids" in params or any(
    p.kind == inspect.Parameter.VAR_KEYWORD for p in params.values()
):
    kwargs["mm_token_type_ids"] = torch.cat(mm_token_type_ids)
```
While this fix is correct, using inspect.signature on every call to get_mrope_input_positions can introduce a performance overhead, especially in a hot path. Since the signature of self.model.get_rope_index is not expected to change at runtime, this check can be performed once and the result cached. A simple way to achieve this is to use lazy initialization with hasattr to cache the check result on the first call.
```python
if not hasattr(self, "_get_rope_index_accepts_mm_token_type_ids"):
    import inspect

    sig = inspect.signature(self.model.get_rope_index)
    params = sig.parameters
    self._get_rope_index_accepts_mm_token_type_ids = (
        "mm_token_type_ids" in params
        or any(p.kind == inspect.Parameter.VAR_KEYWORD for p in params.values())
    )
if self._get_rope_index_accepts_mm_token_type_ids:
    kwargs["mm_token_type_ids"] = torch.cat(mm_token_type_ids)
```
Fair suggestion. Done :)
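An alternative to the `hasattr` pattern (a sketch of the same idea, not code from this PR) is `functools.cached_property`, which computes the check once per instance and caches it transparently. The class and method names below are hypothetical stand-ins:

```python
import functools
import inspect


class Runner:
    """Hypothetical stand-in; get_rope_index mimics a model method
    that lacks both mm_token_type_ids and **kwargs."""

    def get_rope_index(self, input_ids, image_grid_thw=None):
        return input_ids

    @functools.cached_property
    def _rope_index_accepts_mm_token_type_ids(self) -> bool:
        # Evaluated once on first access, then stored on the instance.
        params = inspect.signature(self.get_rope_index).parameters
        return "mm_token_type_ids" in params or any(
            p.kind == inspect.Parameter.VAR_KEYWORD for p in params.values()
        )


runner = Runner()
print(runner._rope_index_accepts_mm_token_type_ids)  # → False
```

This avoids the manual `hasattr` dance, at the cost of requiring the check to live on the class rather than inline at the call site.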
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
@hmellor can you review this?
Thanks for the fix @AndreasKaratzas, I think the intention in the original PR was that… I'll reproduce locally to see what options we have for fixes.
@hmellor Would it be possible to temporarily go with this patch? We would like this test to soon be in the gating set of AMD tests :)
hmellor left a comment
Yeah this will work for now! I'm excited for the Transformers modelling backend to be tested more on AMD!
(I cancelled the build because it looks like it needs a rebase/merge and I can't do it myself because the PR comes from an org)
…llm-project#35711) Signed-off-by: Andreas Karatzas <akaratza@amd.com>
#35097 introduced passing `mm_token_type_ids` to `model.get_rope_index()` in the Transformers multimodal backend. However, not all models accept this parameter. For example, `Qwen2_5_VLModel.get_rope_index()` does not have `mm_token_type_ids` in its signature nor `**kwargs`, causing a `TypeError` at inference time. Reproduced via:

Bisected to fd6de37.
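The failure mode is ordinary Python behavior: calling a function with a keyword argument it does not declare raises `TypeError`. A minimal sketch with a hypothetical stand-in function (not the vLLM code):

```python
def get_rope_index(input_ids, image_grid_thw=None):
    # Mimics a model method whose signature has neither
    # mm_token_type_ids nor **kwargs.
    return input_ids


try:
    # The backend unconditionally passing the kwarg triggers the bug.
    get_rope_index([1, 2, 3], mm_token_type_ids=[0, 0, 1])
except TypeError as exc:
    print(type(exc).__name__)  # → TypeError
```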
Fix
Before passing `mm_token_type_ids` as a kwarg, inspect the signature of `model.get_rope_index()` to check whether it actually accepts the parameter (either explicitly or via `**kwargs`). If it doesn't, skip it.

cc @kenroche
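The guard can be exercised standalone. A minimal sketch, with hypothetical model functions standing in for the real `get_rope_index` implementations:

```python
import inspect


def accepts_kwarg(func, name: str) -> bool:
    """Return True if func declares `name` explicitly or takes **kwargs."""
    params = inspect.signature(func).parameters
    return name in params or any(
        p.kind == inspect.Parameter.VAR_KEYWORD for p in params.values()
    )


def strict(input_ids, image_grid_thw=None):
    # Like Qwen2_5_VLModel.get_rope_index: no mm_token_type_ids, no **kwargs.
    return input_ids


def flexible(input_ids, **kwargs):
    # A model that tolerates extra keyword arguments.
    return input_ids


print(accepts_kwarg(strict, "mm_token_type_ids"))    # → False: skip the kwarg
print(accepts_kwarg(flexible, "mm_token_type_ids"))  # → True: safe to pass
```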